server : better default prompt #2646
Conversation

I noticed this a few times as well and thought we should make the reverse-prompt check case-insensitive to help with this problem.
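
For illustration, here is a minimal sketch of a case-insensitive reverse-prompt comparison; the function names are made up for this example and this is not the actual llama.cpp antiprompt code:

```cpp
// Sketch of a case-insensitive reverse-prompt check (illustrative only).
#include <algorithm>
#include <cctype>
#include <string>

static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return std::tolower(c); });
    return s;
}

// returns true if the generated text ends with the reverse prompt, ignoring ASCII case
static bool ends_with_antiprompt(const std::string & output, const std::string & antiprompt) {
    const std::string lo_out = to_lower(output);
    const std::string lo_ap  = to_lower(antiprompt);
    return lo_out.size() >= lo_ap.size() &&
           lo_out.compare(lo_out.size() - lo_ap.size(), lo_ap.size(), lo_ap) == 0;
}
```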

By the way, lately I also see many complaints in the issues about interactive mode behaving weirdly. We might have broken something along the way; I don't have much time to dig into that at the moment, though.

I've been wondering about this. I've noticed the prompt misbehaving, particularly with llama2-derived models, and for me it only happens with interactive mode + prompt + reverse-prompt. I remember that it started to misbehave some time after commit 8a88e58, because I kept that one as a known-good reference. I've since been using commit 8183159 with the simplified --in-prefix containing the reverse-prompt, and started to have some doubts around commit 25d43e0. The thing is that the llama2 models arrived during this range, and I can't say how much was due to the models and how much to the code. The range 8a88e58..25d43e0 is not wide; if we exclude CUDA/vim/server/metal changes, we're left with roughly 20 commits. The problem is that the issue has not been systematic enough to make this bisectable for now, and I'm not even 100% certain it didn't start a bit before or after. We'd need to find a prompt and initial seed that reproducibly triggers it. I'm also too short of time to try this, but eventually I might if nobody beats me to it.

Actually I might have caught one such case even with 8a88e58: the first occurrence of the prompt was missing after "how can I help you", then it appeared at the end of lines. It's still the same with 1f0bccb. I pasted a screenshot in color mode, which makes the problem more visible. With commit 1d16309 this was not the case, and back then a line feed was sent before the prompt. I'm trying to bisect it now.

OK, I just found the faulty one. At first glance it didn't look related, but look closely: it changed the default eps value from 1e-6f to 5e-6f. I have no idea what it's used for, but I could confirm that changing it on the command line restores the good behavior. Latest master now shows the prompt correctly if I set it back to 1e-6 on the command line. I never understood what this eps was used for, but it definitely has an impact here. Unfortunately there's no info in the commit message about the rationale for that change. Anyway, if it happens that this magic value is sufficient to fix the various prompt issues, it's no big deal to add it to scripts once it's known. Also, I don't know if these are the same issues others have noticed.
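
For context, and not something stated in this thread: my understanding is that this eps is the small constant added inside RMS normalization to keep the denominator away from zero, so its value slightly changes every normalized activation. A minimal sketch of the computation (learned scale weights omitted; this is not the ggml implementation):

```cpp
// Sketch of RMS normalization to show where rms_norm_eps enters (illustrative only).
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<float> rms_norm(const std::vector<float> & x, float eps /* e.g. 1e-6f or 5e-6f */) {
    float mean_sq = 0.0f;
    for (float v : x) {
        mean_sq += v * v;
    }
    mean_sq /= x.size();

    // a larger eps means a larger denominator, i.e. slightly dampened activations
    const float scale = 1.0f / std::sqrt(mean_sq + eps);

    std::vector<float> y(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale;
    }
    return y;
}
```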

To clarify, I've used exactly this command line:

The default

It would be bad to remove it from the command line if it currently allows users to fix things when the default doesn't work well. Here, for me, the default value for llama2 works very poorly, and reverting to 1e-6 gives better behavior, so I'd like to keep the ability to force it if an unusable value is hard-coded into the model.

What I meant is that this value is model-specific. After the GGUF change, llama1 models will use 1e-6 and llama2 models will use 1e-5, taken from the model file.

OK, but it doesn't cost anything to leave it adjustable by end users via the command line if they observe different behavior. I've had 1e-5 set in a few scripts after finding it in a comment somewhere, but at least now that I know the prompt issues can be related to this value, if I run into trouble again in the future I'll know I can try changing it and report my findings. If you remove the command-line option, I won't have that possibility anymore.

It's a model hyperparameter; there is no reason to change it. It was only added to the command line temporarily because the llama2 models use a different value.

If you're absolutely certain that nobody needs to fix it, I get your point. But the commit above was possibly made by people who were also certain that their value was correct, yet it made the models poorly usable, and the value they chose back then is still the one currently used. What I don't get is what it costs to keep the command-line option just to force it; it amounts to a handful of lines (see the sketch below). Like most users, I'm loading working models from TheBloke and experimenting with a few settings to try to make them behave well; I'm unable (and really not willing) to modify these files. And requiring everyone to rebuild all their models if there is ever a consensus that 1e-5 is still not sufficient and needs to be increased wouldn't be great. However, I totally support the principle of shipping the default value in the model.
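
For illustration only, here is a hypothetical sketch of what such an override could look like; the flag name, struct, and parsing loop are assumptions for this example, not the actual llama.cpp argument parser:

```cpp
// Hypothetical sketch of a command-line override for the RMS-norm epsilon
// (names are illustrative, not the real llama.cpp code).
#include <cstring>
#include <cstdlib>

struct eps_params {
    float rms_norm_eps = 5e-6f; // whatever default the build (or model) ships with
};

static void parse_eps_override(int argc, char ** argv, eps_params & params) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--rms-norm-eps") == 0) {
            params.rms_norm_eps = std::strtof(argv[i + 1], nullptr); // e.g. "1e-6"
        }
    }
}
```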

OK, I'm still having a prompt issue that I can reproduce after several exchanges with the Vigogne model at both eps 1e-5 and 1e-6, and that doesn't happen with the old code above. I'll restart the bisect; it's possible that this time we'll find something more closely related to a prompt issue.

@ggerganov now I could bisect a prompt issue that doesn't depend on the eps value, and found this one. When reading the patch, I'm pretty sure I've already read about it in another issue somewhere; I recognize the block of code that moved. Of course, passing the new option on the command line doesn't fix the problem at all, so it's a real regression. The reproducer I've found is the following: I enter

@ggerganov I understand your time is limited, so I've gathered info into a poll that shows an unintentional effect of the change. Currently, master llama.cpp is ruthless in its requirement to precisely follow a prompt template. The most blatant example is with vicuna-7b-v1.5. Previously I used, following jxy's instructions; here's an example of failure: #2507 (comment). Here's another example, with Vic.txt as the prompt file; as you can see, I didn't get a chance to type at all. Changing … Thank you.

OK, I think I see the problem. It comes from keeping the EOS token. Ignoring it is not the same as before the BOS commit, because there is no newline. I'm on a phone so I cannot push a fix; likely on Monday or Tuesday.

That's great news if you've figured it out. I often say that an identified problem is half solved :-) If you think the fix is simple enough to be explained, I can give it a try and offload that from you. Otherwise I'll happily test your fix once you have one.
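
A speculative sketch of the kind of fix described above, assuming the issue is that a kept EOS token no longer produces a newline before the reverse prompt; the function and token names here are hypothetical, not the actual llama.cpp code:

```cpp
// Speculative sketch: map an interactive-mode EOS to a newline and signal that
// the reverse prompt should be injected, instead of silently dropping the token.
// Token ids and names are hypothetical, not the llama.cpp API.
static int handle_interactive_eos(int token, int eos_token, int newline_token, bool & need_antiprompt) {
    need_antiprompt = false;
    if (token == eos_token) {
        // emit a newline so the reverse prompt starts on its own line,
        // and ask the caller to append the reverse prompt to the context
        need_antiprompt = true;
        return newline_token;
    }
    return token;
}
```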




With the default server settings on master, the bot continues to generate the user's response due to capitalization differences between the prompt and the username. Here is the result from just sending "Hi" as input. With the proposed prompt, it behaves better.
However, it often generates the string "\end{code}".
Probably we need a few-shot example in the prompt to drive it into the correct "dialogue"?
This is using vanilla LLaMA v1 7B.
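
As a rough illustration of the few-shot idea, here is one possible shape for such a default prompt, written as a C++ string constant; the wording and the name k_default_prompt are made up for this sketch and are not the prompt proposed in this PR:

```cpp
// Illustrative few-shot default prompt (hypothetical; not the prompt from this PR).
// The idea is to show the model one or two complete "User:" / "Assistant:" turns
// so it keeps the dialogue format instead of continuing the user's line.
static const char * k_default_prompt =
    "This is a conversation between User and Assistant, a friendly chatbot.\n"
    "Assistant is helpful, kind, honest, and answers User's requests precisely.\n"
    "\n"
    "User: Hello, how are you?\n"
    "Assistant: Hi! I am doing great, how can I help you today?\n"
    "User: What is the capital of France?\n"
    "Assistant: The capital of France is Paris.\n"
    "User:";
```

The client would then append the user's message and "Assistant:" before requesting a completion, and use "User:" as the reverse/stop prompt so the model does not answer on the user's behalf.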